=programming =graphics =algorithms
Current 3d rendering works as follows:
There's an
array of triangles.
Each triangle has pointers to 3 elements of a
vertex array.
Each vertex has a texture position and possibly extra
data.
The pixels in the triangle are found.
The triangle's depth at each pixel is tested against the z-buffer to check if it's
visible.
For each pixel, the texture position and other data are
interpolated between the triangle vertices.
The interpolated texture
position is used to access one or more textures. The texture pixels closest
to the desired point are interpolated.
The interpolated
texture data is input to a shader.
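Here's a rough sketch of those data structures and the per-pixel interpolation. The names are made up rather than taken from any real API, and perspective correction is skipped to keep it short:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical structs for the pipeline described above (not any real API).
struct Vertex {
    float position[3];  // XYZ in model space
    float uv[2];        // texture position
    // ...possibly extra data: normal, tangent, bone weights, etc.
};

struct Triangle {
    uint32_t vertex_index[3];  // points at 3 elements of the vertex array
};

struct Mesh {
    std::vector<Vertex>   vertices;
    std::vector<Triangle> triangles;
};

// For each covered pixel, the depth is tested against the z-buffer, and the
// vertex attributes are blended with barycentric weights. (Perspective
// correction is skipped here to keep the sketch short.)
struct Fragment {
    float depth;
    float uv[2];  // fed to the shader, which samples one or more textures
};

Fragment interpolate(const Mesh& mesh, const Triangle& tri,
                     const float bary[3], const float vertex_depths[3]) {
    Fragment f{};
    for (int i = 0; i < 3; ++i) {
        const Vertex& v = mesh.vertices[tri.vertex_index[i]];
        f.depth += bary[i] * vertex_depths[i];
        f.uv[0] += bary[i] * v.uv[0];
        f.uv[1] += bary[i] * v.uv[1];
    }
    return f;
}
```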
What if you just put the texture
data in the vertex? People do that! Using vertex colors that way has been
somewhat common in computer graphics, and is known to have better
performance. The main disadvantage is that resolution is lower. But what if
you just use more vertices? If you have a vertex per pixel, then vertex
colors have as much resolution as texture lookups. Why not do that?
That used to be too many triangles for GPUs. Now, it isn't.
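A sketch of what the vertex format looks like in that case; the layout is hypothetical, not any particular engine's:

```cpp
#include <cstdint>

// Hypothetical vertex layout: the color lives in the vertex, so there are no
// UVs and no per-pixel texture fetch.
struct ColoredVertex {
    float   position[3];  // XYZ
    uint8_t rgb[3];       // the data a texture lookup would have provided
};

// Per-pixel work reduces to blending the three vertex colors with barycentric
// weights. With roughly one vertex per screen pixel, this gives about the same
// effective resolution as sampling a texture.
void shade_pixel(const ColoredVertex v[3], const float bary[3], float out_rgb[3]) {
    for (int c = 0; c < 3; ++c) {
        out_rgb[c] = (bary[0] * v[0].rgb[c] +
                      bary[1] * v[1].rgb[c] +
                      bary[2] * v[2].rgb[c]) / 255.0f;
    }
}
```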
Animation
with that many vertices gets computationally expensive. But we can solve
this by animating models with fewer vertices, then doing polygon
tessellation. This is often called
micropolygon
rendering.
Micropolygons are sometimes used with displacement
mapping, which replaces
normal maps with
textures that indicate offsets along the surface normal. Each micropolygon has a lookup into
the displacement map texture, with the texture position interpolated from
its parents. The micropolygons generated by tessellation don't have
vertex data. All they have is interpolated data from their parent vertices.
But why not give them vertex data with colors and displacement, instead of
using textures?
Obviously it's easier to add displacement maps to
existing models, but in theory, replacing textures with vertex colors on
micropolygons should be a viable system.
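A sketch of what a generated vertex could carry in that scheme; again, the layout is hypothetical:

```cpp
#include <cstdint>

// Hypothetical micropolygon vertex: instead of interpolated UVs driving a
// displacement-map lookup and a color-texture lookup, the generated vertex
// stores its displacement and color directly.
struct SubVertex {
    float   displacement;  // offset along the interpolated surface normal
    uint8_t rgb[3];        // per-vertex color, replacing a texture sample
};

// The base position and normal are interpolated from the animated parent
// vertices; the stored displacement then pushes the point off the coarse
// surface, like a displacement map would.
void apply_subvertex(const float base_pos[3], const float base_normal[3],
                     const SubVertex& sv, float out_pos[3]) {
    for (int i = 0; i < 3; ++i) {
        out_pos[i] = base_pos[i] + base_normal[i] * sv.displacement;
    }
}
```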
The existing texture-based
systems work very well. Why is there any need to consider alternate systems?
Unreal had similar thoughts behind its creation of
Nanite, so we can see what they have to say.
That doesn't use
tessellation; instead, it uses very high-poly-count models with hierarchical
levels of detail. For each piece of a model, Nanite determines if it's
visible using
occlusion culling, determines how large it appears, and then chooses a model
version with the appropriate number of polygons.
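Here's a toy sketch of that kind of per-piece LOD selection. This is just the general idea, not Nanite's actual algorithm, and the one-triangle-per-pixel target is my assumption:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Toy LOD selection (not Nanite's actual algorithm): estimate how large a
// piece of the model appears on screen, then pick the precomputed version
// whose triangle count roughly matches that size.
struct LodLevel {
    std::size_t triangle_count;
    // ...geometry for this level
};

std::size_t choose_lod(const std::vector<LodLevel>& levels,  // sorted coarsest first
                       float bounding_radius_world,
                       float distance_to_camera,
                       float screen_height_pixels,
                       float vertical_fov_radians) {
    // Approximate projected height of the piece, in pixels.
    float projected = bounding_radius_world /
                      (distance_to_camera * std::tan(vertical_fov_radians * 0.5f)) *
                      screen_height_pixels;
    // Hypothetical target: roughly one triangle per covered pixel.
    float target_triangles = projected * projected;
    std::size_t chosen = 0;
    for (std::size_t i = 0; i < levels.size(); ++i) {
        if (levels[i].triangle_count <= target_triangles) chosen = i;
    }
    return chosen;  // pieces that fail occlusion culling would be skipped entirely
}
```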
As
that Nanite page notes, normal maps take more data than adding enough polygons
to make them unnecessary. Modern games take a lot of storage space, and this
is primarily because of high-resolution textures. Reducing the data used for
3d models could significantly reduce download and loading times.
You
also, of course, get the advantages of a real high-poly model over normal
maps: the fake displacement of normal maps looks wrong if you look closely
from a steep angle, and doesn't cast shadows correctly.
Nanite is
also in some ways easier for artists. A typical art workflow for static
environments involves:
- sculpting a high-poly untextured model
- polygon reduction
- UV unwrapping
- texturing
With Nanite, you can use the
high-poly sculpt directly, eliminating the polygon reduction step; that step is fairly
automated these days, but skipping it still saves some time.
But the Nanite approach has some disadvantages
compared to tessellation. A big one is that Nanite does not support
animation with bones or blend shapes. It also uses more data: a regular
vertex has XYZ positions, while a generated vertex only needs displacement.
(Also, because you can store displacement as a % of the distance between
parent vertices, you can use fewer bits for it, but this isn't really
necessary.)
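For example, the displacement could be quantized as a signed fraction of the parent edge length, something like this (the 8-bit width here is arbitrary):

```cpp
#include <algorithm>
#include <cstdint>

// Sketch of the optional compression mentioned above: store displacement as a
// signed fraction of the distance between the parent vertices, quantized to
// 8 bits.
int8_t encode_displacement(float displacement, float parent_edge_length) {
    float fraction = displacement / parent_edge_length;
    fraction = std::clamp(fraction, -1.0f, 1.0f);
    return static_cast<int8_t>(fraction * 127.0f);
}

float decode_displacement(int8_t stored, float parent_edge_length) {
    return (stored / 127.0f) * parent_edge_length;
}
```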
In addition to eliminating the polygon reduction step, I
also want to eliminate the UV unwrapping step. One approach to doing this is
"mesh colors".
That uses vertex colors to generate textures for models. This has a
performance penalty, but has some potentially significant advantages for
artists.
Why can't we just use vertex colors with Nanite instead of
textures? Well, we can, but that doesn't get the full advantages of mipmaps.
Suppose you have a colorful painting, and you zoom out until it's a single
pixel on the screen. The correct color for that pixel is the average of the
whole painting, not a random pixel from the high-resolution version. Mipmaps
store pre-averaged colors that make this easy. Nanite has multiple levels of
detail, but that's not enough to replace mipmaps.
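Mipmap generation is just repeated 2x2 averaging, so the smallest level ends up holding the average of the whole image. A minimal sketch (assuming even dimensions):

```cpp
#include <cstddef>
#include <vector>

// Build one mip level: each output pixel is the average of a 2x2 block of the
// level above. Repeating this down to 1x1 yields the whole-image average.
// Assumes width and height are even, packed RGB floats.
std::vector<float> downsample_rgb(const std::vector<float>& src,
                                  std::size_t w, std::size_t h) {
    std::size_t ow = w / 2, oh = h / 2;
    std::vector<float> dst(ow * oh * 3);
    for (std::size_t y = 0; y < oh; ++y) {
        for (std::size_t x = 0; x < ow; ++x) {
            for (int c = 0; c < 3; ++c) {
                float sum = src[((2 * y) * w + 2 * x) * 3 + c]
                          + src[((2 * y) * w + 2 * x + 1) * 3 + c]
                          + src[((2 * y + 1) * w + 2 * x) * 3 + c]
                          + src[((2 * y + 1) * w + 2 * x + 1) * 3 + c];
                dst[(y * ow + x) * 3 + c] = sum * 0.25f;
            }
        }
    }
    return dst;
}
```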
The viability of
mesh colors suggests a possible solution: generate pseudotextures
dynamically. To render a model, choose an LOD that has about one vertex
per pixel; in general, some areas will then have several vertices per pixel.
First, render that model LOD at a higher resolution than the display
resolution. Some blocks of this render might have higher resolution than
others, depending on the vertex density in each block. Then, use that render
as a texture: blur it, sample the blurred data at each vertex's
location in screen space, and cache that sampled data in the vertices.
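A sketch of that cache-update step, assuming the blurred screen-space render is already available; all names here are placeholders:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical data for the dynamic-pseudotexture pass described above.
struct ScreenSpaceRender {          // the over-sampled, blurred, unlit render
    std::vector<float> rgb;         // packed RGB
    std::size_t width = 0, height = 0;
};

struct CachedVertex {
    float screen_x = 0, screen_y = 0;   // projected position of the vertex
    float cached_rgb[3] = {0, 0, 0};    // filtered data cached in the vertex
};

// Bilinear lookup into the blurred render: this is the "sample at each vertex
// location in screen space" step. Assumes width, height >= 2 and x, y >= 0.
void sample_rgb(const ScreenSpaceRender& tex, float x, float y, float out[3]) {
    std::size_t x0 = std::min(static_cast<std::size_t>(x), tex.width - 2);
    std::size_t y0 = std::min(static_cast<std::size_t>(y), tex.height - 2);
    float fx = x - x0, fy = y - y0;
    for (int c = 0; c < 3; ++c) {
        auto at = [&](std::size_t px, std::size_t py) {
            return tex.rgb[(py * tex.width + px) * 3 + c];
        };
        out[c] = (1 - fx) * (1 - fy) * at(x0, y0) + fx * (1 - fy) * at(x0 + 1, y0)
               + (1 - fx) * fy * at(x0, y0 + 1) + fx * fy * at(x0 + 1, y0 + 1);
    }
}

// Given the blurred, unlit, over-sampled render of the chosen LOD, sample it
// at each vertex's screen position and store the result in the vertex cache.
void update_vertex_cache(const ScreenSpaceRender& blurred_render,
                         std::vector<CachedVertex>& vertices) {
    for (CachedVertex& v : vertices) {
        sample_rgb(blurred_render, v.screen_x, v.screen_y, v.cached_rgb);
    }
}
```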
That approach should result in perfect anisotropic filtering if the
texture matches the model orientation. It's desirable to only recalculate
the cached values occasionally, so there would be some mismatch. But
calculating this data is relatively fast: while the resolution is higher, no
lighting or shaders are applied, so rendering would be equivalent to that of
an unlit vertex-colored model, which is very fast. If these cached values
are calculated from 16x the pixels of the base resolution, and recalculated
every 16 frames, the performance cost would be relatively small, and the
quality should be good enough. But we can do better, by prioritizing
recalculation of cached values for models with the greatest change in
orientation relative to the camera.
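A sketch of that scheduling policy, assuming each model keeps track of the camera-relative direction it was last cached at; the per-frame budget is a parameter:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Prioritized cache refreshes: each frame, only a limited number of models
// recompute their cached vertex data, and models whose orientation relative
// to the camera has drifted the most go first.
struct ModelCacheState {
    int   id = 0;
    float view_dir_at_cache[3] = {0, 0, 1};  // model-to-camera direction when last cached
    float view_dir_now[3]      = {0, 0, 1};  // model-to-camera direction this frame
};

static float orientation_drift(const ModelCacheState& m) {
    // 1 - cos(angle) between the cached and current view directions.
    float dot = 0;
    for (int i = 0; i < 3; ++i) dot += m.view_dir_at_cache[i] * m.view_dir_now[i];
    return 1.0f - dot;
}

std::vector<int> pick_models_to_refresh(std::vector<ModelCacheState> models,
                                        std::size_t budget_per_frame) {
    std::sort(models.begin(), models.end(),
              [](const ModelCacheState& a, const ModelCacheState& b) {
                  return orientation_drift(a) > orientation_drift(b);
              });
    std::vector<int> chosen;
    for (std::size_t i = 0; i < models.size() && i < budget_per_frame; ++i) {
        chosen.push_back(models[i].id);
    }
    return chosen;
}
```

With the numbers above (recalculating each model's cache about every 16 frames on average), the budget would be roughly one-sixteenth of the visible models per frame.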
In terms of data layout, the
above approach might look like this:
A mesh has the following arrays:
- parent triangles: indices for 3 parent vertices, indices for 3 subvertices, a pointer to a vertex data array, 4 subtriangle indices
- subtriangles: indices for 3 subvertices, 4 subtriangle indices
- parent vertices: XYZ positions, a vertex data index, cached vertex data
- subvertices: displacement, a vertex data index, cached vertex data
Vertex data arrays would be
shared across models. A vertex data element is accessed with the pointer in
a triangle and an index in the vertex. Different vertex data arrays may have
different amounts of data per vertex; a typical element might include 7
numbers: RGB, specularity, RGB emission. The cached vertex data has the same
size as the source vertex data, but is unique to the mesh rather than
shared.
The pointers to vertex data arrays are per-triangle instead
of per-mesh so that multiple arrays of vertex data can be used in a single
mesh, to enable better reuse across models. The cached vertex data stored at
each vertex element would then have the largest element size of any of the
vertex data arrays used by the mesh.
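Written as C++-style structs, that layout might look roughly like this. The field choices follow the example values above, and the fixed-size cached element is a simplification; this is only a sketch:

```cpp
#include <cstdint>
#include <vector>

// One element of a shared vertex data array: RGB, specularity, RGB emission
// (the 7-number example from the text).
struct VertexData {
    float rgb[3];
    float specularity;
    float emission_rgb[3];
};
using VertexDataArray = std::vector<VertexData>;  // shared across models

struct ParentVertex {
    float      position[3];        // XYZ
    uint32_t   vertex_data_index;  // index into the triangle's vertex data array
    VertexData cached;             // cached (filtered) data, unique to the mesh
};

struct SubVertex {
    float      displacement;       // offset from the interpolated parent surface
    uint32_t   vertex_data_index;
    VertexData cached;
};

struct SubTriangle {
    uint32_t subvertex[3];  // indices into the mesh's subvertex array
    uint32_t child[4];      // indices of 4 child subtriangles
};

struct ParentTriangle {
    uint32_t parent_vertex[3];           // indices into the parent vertex array
    uint32_t subvertex[3];               // indices into the subvertex array
    const VertexDataArray* vertex_data;  // which shared array this triangle uses
    uint32_t child[4];                   // indices of 4 child subtriangles
};

struct Mesh {
    std::vector<ParentTriangle> parent_triangles;
    std::vector<SubTriangle>    subtriangles;
    std::vector<ParentVertex>   parent_vertices;
    std::vector<SubVertex>      subvertices;
};
```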
I think this is a viable rendering system! Maybe Epic or Microsoft or AMD or NVIDIA should give it a try!